Mon. Not. R. Astron. Soc. 000, 000-000 (0000) Printed 26 June 2008 (MN M£X style file v2.2) 



A Hybrid TV-Body Code Incorporating Algorithmic 
Regularization and Post-Newtonian Forces 
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ABSTRACT 

We describe a novel ./V-body code designed for simulations of the central regions of 
galaxies containing massive black holes. The code incorporates Mikkola's "algorith- 
mic" chain regularization scheme including post-Newtonian terms up to PN2.5 order. 
Stars moving beyond the chain are advanced using a fourth-order integrator with 
forces computed on a GRAPE board. Performance tests confirm that the hybrid code 
achieves better energy conservation, in less elapsed time, than the standard scheme 
and that it reproduces the orbits of stars tightly bound to the black hole with high 
precision. The hybrid code is applied to two sample problems: the effect of finite- N 
gravitational fluctuations on the orbits of the S-stars; and inspiral of an intermediate- 
mass black hole into the galactic center. 

Key words: Galaxy: centre - stellar dynamics - methods: iV-body simulations 



1 INTRODUCTION 



Standard iV-body integrators have difficulty reproducing the 
motion of tight binaries, or following close hyperbolic en- 
counters between stars. Unless the integration time step is 
made very short, the relative orbit will not be accurately 
reproduced, and the system energy may also exhibit an un- 
acceptable drift since a single compact subsystem can con- 
tain a large fraction of the total binding energy. This dif- 
ficulty is particularly limiting for studies of the centers of 
galaxies like the Milky Way, where the gravitational poten- 
tial is dominated by a (single or binary) supermassive black 
hole. Early attempts to simulate such systems often had 
diffi culty achieving the req uired accuracy and performance 
fe.g. lSigurdsson et alj|l99a h In later studies, stars in tightly 
bound orbits around the black hole were sometimes simply 
remo ved (e.g. iBaumgardt et all 12004 : iMatsubavashi et al.l 
2007); in other studies, the position and velocity of the 
most massive particle were artificially fixed and the stel- 
lar orbits approximated as pert urbed Keplerian ellipses (e.g. 
lLockmann fc BaumgardtfeOQSl ) . While these approaches can 
be useful for certain problems, a less restrictive treat- 
ment is indicated for studies in which the motion of stars 
near the black hole is used to constrain the magnitude 
of non-Keplerian pe r turbations (|Fragile fc Mathews! |2000| ; 
IWeinberg et~ai1l2005l ; IWillll200Sl) . Fixing the location of the 
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most massive particle can also be problematic when simu- 
lating binary or multiple supermassive black holes. 

lAarsetbl (|2003bh summarizes the various regularized 
schemes that have been incorporated into N-body codes to 
treat strong gravitational interactions with high accuracy 
and without loss of performance. 

(1) T he KS-regularization method l|Kustaanheimo fc Stiefell 
1 1965T) . together w i th t he original chain method 
1 Mikkola fc Aarsetbl 1 19931 ) which uses the KS- 
transformation to regularize multiparticle systems, has 
been widely used to integrate binaries in simulations of 
star clusters. This method ha s also been applied to the 
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binary black hole problem (e.g . 
iMilosavlievic fc Merrittl l200li ; 
approach suffers from inaccuracies when mass ratios are 
very large because the total energy that appear s in the 
equati ons of motion is dominated by the binary (|Aarsethl 
l2003al ). 

(2) In the case of a single dominant body, an alternative to 
the chain geometry, called wheel-spoke (WS) regularization, 
treats each interaction with the massive body via the stan- 
dard KS-regularization, while o ther interactions use a sm all 
softening to avoid singularities rtZar el l 1974 lAarsetrJ[2007l ). 

(3) Algorithmic regularization (AR) methods effectively re- 
move singularities with a time transformation accompanied 
by the leapfrog algorithm. There ar e two such methods: 
the log ar ithmic Hamiltonian (LogH ) (|Mikkola fc Tanikawal 
Il999bl lal; iPreto fc Tremainel 1 19991 ) and time-transformed 
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leapfrog (TTL) (jMikkola fc Aarsethll20o3 ) . Both these algo- 
rithms produce exact trajectories for the unperturbed two- 
body problem and provide regular results for more gen- 
eral cases. In the LogH method the derivative of time is 
inversely proportional to the gravitational potential while 
in TTL the potential is replaced by the sum of all inverse 
distances. Contrary to the KS-chain, zero masses do not 
cause any singularity in these AR-methods; TTL even pro- 
vides equal weight to all the members of the subsystem, 
and thus these methods are well suited for integration of 
systems with la rge mass ratios. Use of th e generalized mid- 
point method ijMikkola fc Merrittl 120061 ') allows velocity- 
dependent terms, e.g. the post-Newtonian expansion. In ad- 
dition to the leapfrog and/or generalized mid-point method 
one must use the Bulirsch-Stoer (BS) extrapolation method 
for high accuracy. The basic algorithms provide the cor- 
rect symmetry for the B S method to work efficie ntly. In 
a recent implementation l)Mikkola fc Merrittl 1200^ 1. called 
AR-CHAIN, the chain st ructure, introduced originally by 
iMikkola fc Aarsethl (|l993l ). is incorporated with a new time- 
transformation that combines the advantages of the LogH 
and TTL and generalized mid-point methods. The chain 
structure significantly reduces the roundoff error and the 
other transformations provide regular data, necessary to 
achieve high precision, for the BS-extrapolator. The AR- 
CHAIN code also includes post-Newtonian terms to order 
PN2.5 



In this paper we describe the performance of a new, hy- 
brid A^-body code that incorporates AR-CHAIN. The new 
code, called ipGRAPEch, is based on (the serial version 
of) c^GRAPE, a general-purpose, direct-summation Af-body 
code which uses G RAPE special-pur pose hardware to com- 
pute accelerations (|Harfst et alj2007h . The new code divides 
particles into two groups: particles associated with the mas- 
sive object (or objects) and that are included in the chain, 
and particles outside the chain that are advanced via the 
Hermite scheme of (/?GRAPE. Some of the latter particles 
are denoted as perturbers and are allowed to affect the mo- 
tion of stars in the chain, and vice versa. After describing 
the hybrid code (321, we present the results of performance 
tests based on a model that mimics the star cluster around 
a supermassive black hole (SJ3J). We show that the hybrid 
code can achieve higher overall accuracies (as measured via 
energy conservation, say) than <^GRAPE alone, and in less 
elapsed time, in spite of the additional overhead associated 
with the chain. 



2 THE HYBRID A-BODY CODE 

In this section we describ e how the AR-CHAIN algorithm of 
Mikk oia fc Merrittl (|2008l ) was integrated into t he serial ver- 
sion of the direct-summation code (ysGRAPE l|Harfst et al.l 
2007). T he latter algorithm emplo ys a Hermite integration 
scheme l|Makino fc Aarsethl |l992) with hierarchical, com- 
mensurate block time steps and uses a GRAPE board to 
calculate forces. 

We begin by briefly describing the integration scheme 
without the chain and its implementation using the GRAPE. 



2.1 Integration scheme 

In addition to position Xj, velocity V;, acceleration a,*, and 
time derivative of acceleration a; , each particle i has its own 
time U and time step AU. 

Integration consists of the following steps: 

(1) The initial time steps are calculated from 

I ail 



AU 



(1) 



where typically r] s = 0.01 gives sufficient accuracy. 



(2) The system time t is set to the minimum of all ti + Ati, 
and all particles i that have i* + AU = t are selected as active 
particles. 

(3) Positions and velocities at the new t are predicted for 
all particles using 

(t — tj) 2 (t-t-) 3 
Xj.p = Xj,o + (t - *j)Vj, + 7T— a J.o + 



v j.p = ^j,o + {t-tj)a. jfi + - — jr^a/.o- 



(2b) 



Here, the second subscript denotes a value given either at 
the beginning (0) or the end (1) of the current time step. All 
quantities used in the predictor can be calculated directly, 
i.e. no memory of a previous time step is required. 

(4) Acceleration and its time derivative are updated for 
active particles only according to 



a;,i 



= Gm i 



J (r ! 2 , + £ 2 ) (3/2)' 



a,,i = 



where 



+ 



3(v 



( r 2 +e 2)(3/2) ( r 2 +£ 2)(5/2) 



(3a) 
(3b) 



In [|4]we apply ^GRAPEch to the integration of a re- 
alistic, multi-component model of the Galactic center. In 
the first application ( H4.ip we accurately evaluate, for the 
first time, the effects of perturbations from stars and stel- 
lar remnants on the orbital elements of bright stars ob- 
served on short-period orbits about Sagitt arius A* (S-stars: 
iGhez et al.l 120031 : lEisenhauer et al.1 120051 ). We then f p~2j) 
present an integration of the orbit of an intermediate-mass 
black hole as it spirals into the galactic center via the com- 
bined influence of dynamical friction and gravitational-wave 
energy loss. 



and e is the softening parameter. 



(4a) 
(4b) 



(5) Positions and velocities of active particles are cor- 
rected using 



,( 2 ) 



Ail (3) 

120 a M" 



X M — Xi .p + 24 ai .° 

_L A ^ (2) , 

Vi,i = v,, p + — &l> + ^-a i0 



,(3) 



(5a) 
(5b) 



where the second and third time derivatives of a are given 
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by 



-6 (ai,o - ai,i) — Atj (4a.i,o + 2ai,i) 



l -° " Aif 
(3) _ 12 (a», - a M ) + 6At, (a», + a,,i) 
~ At? 



(6a) 
(6b) 



(6) The times ti are updated and the new time steps Ati 
are determine d. Time steps are calculated using the stan- 
dard formula (|Aarsethlll985h : 



Ati A 



Ki||aS| 



a 4 .i 



(7) 



The parameter r\ controls the accuracy of the integration and 
is typically set to 0.01 (although the use of smaller value of 
7] is described below). The value of a^ { is calculated from 



a i,0 + £iI »,0 a i,0 



and a' 3 ^ is set to Eifg. 

(7) Repeat from step (2). 

A hierarchical commensurate block time step scheme 
is necessary when the Hermite integrator is used with the 
GRAPE (and is also effic ient for p a ralleli zation and vec- 
torization; see below and McMiUanj (1986)). Particles are 
grouped by replacing their time steps Ati with a block time 
step Atj,b = (l/2) n , where n is chosen according to 



(9) 



The commensurability is enforced by requiring that t/AU 
be an integer. For numerical reason we also set a minimum 
time step At m i n , where typically 



A£ m in — 2 



(10) 



with m ^ 28. The time steps of particles with AU < A£ m i n 
are set to this value. The minimum time step should be con- 
sistent with the maximum acceleration defined by the soften- 
ing parameter; monitoring of the total energy can generally 
indicate whether this condition is being violated. 



(4) Predict positions and velocities for all other par- 
ticles on the GRAPE, and calculate forces and their time 
derivatives for active particles. 

(5) Retrieve forces and their time derivatives from the 
GRAPE and correct positions and velocities of active 
particles on the host. 

(6) Compute the new time steps and update the parti- 
cle data on the host of all active particles in the GRAPE 
memory. 

(7) Repeat from step (2). 



2.3 Integrating AR-CHAIN into ^GRAPE 

Here we describe the hybrid A^-body code. We first describe 
the step-by-step algorithm as in the previous section, then 
provide a more detailed explanation of the most important 
steps. 

(1) Initialize the GRAPE and send particle data 
(positions, velocities, etc.) to GRAPE memory. 

(2) Compute the next system time t and select active 
particles on the host (same as step 2 in previous section). 

(3) If the chain is not active, check the active particles 
to see whether the chain should be used. If the chain is 
active, check if particles enter or leave the chain. 

(4) If the chain is not needed, standard integration can 
continue from step (14). Otherwise, (re-) initialize the 
chain if a new chain has been created or an existing chain 
has changed. 

(5) Find all particles within a sphere of radius r pC rt 
centered on the center-of-mass (COM) chain particle and 
define them as perturbers. 

(6) Compute forces on the COM particle (if the chain 
is new or has changed). 



2.2 GRAPE implementation 

The GRAPE-6 and GRAPE-6A hardware has been designed 
to work with a Hermite integration scheme and is therefore 
easily integrated into the algorithm described in the previ- 
ous section f see iMakino et all 120031 ) . In detail, integration 
of particle positions using the GRAPE-6A consists of the 
following steps: 

(1) Initialize the GRAPE and send particle data 
(positions, velocities, etc.) to GRAPE memory. 

(2) Compute the next system time t and select active 
particles on the host (same as step 2 in previous section). 

(3) Predict positions and velocities of active particles 
only and send the predicted values together with the new 
system time t to GRAPE's force calculation pipeline. 



(7) Evolve the chain one time step. 

(8) Predict the position and velocity of the COM chain 
particle, compute forces and correct its position and 
velocity. 

(9) Resolve the chain and update positions and veloc- 
ities of both the chain COM particle and the individual 
chain particles. 

(10) Integrate all active particles; particles within r Tes 
feel the forces from the resolved chain (i.e. individual chain 
particles) and particles outside r ros feel only the forces from 
the chain COM particle. 

(11) Compute new time steps for the active particles. 

(12) Update the particle data on the host CPU and the 
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GRAPE for all active particles. 

(13) Repeat from step (2). 

(14) Predict positions and velocities of active particles 
only and send the predicted values together with the new 
system time t to the GRAPE force calculation pipeline. 

(15) Predict positions and velocities for all other par- 
ticles on the GRAPE, and calculate forces and their time 
derivatives for active particles. 

(16) Retrieve forces and their time derivatives from 
the GRAPE and correct positions and velocities of active 
particles on the host. 

(17) Compute the new time steps and update the 
particle data on the host of all active particles in the 
GRAPE memory. 

(18) Repeat from step (2). 

In the remainder of this section, we describe in detail 
the implementation of the hybrid code. For purposes of dis- 
cussion, we assume a test case where a very massive particle 
is located near the center of the system, e.g. a supermassive 
black hole in the galactic center. But the same algorithm 
can be easily applied to other configurations as well. 

The hybrid code is initialized in the manner described 
above. After the time steps for all particles have been de- 
termined, a check is made whether to use the chain. Two 
parameters control the assignment of particles to the chain: 
tcrit and r cr i t . Initially, any particle that satisfies both crite- 
ria: 

Afc < t crit (11a) 
An,BH < r crit (lib) 

is selected as a chain particle, where t^bh is the distance 
of particle i to the massive particle. Once the chain is acti- 
vated and a chain radius (as defined below) is determined, 
all particles within r c h are assigned to the chain, i.e. t cr it is 
ignored. If two or more particles are selected as chain par- 
ticles the chain is activated; the massive particle is always 
designated a chain particle. 

If the chain is not needed the code continues with the 
standard A^-body integration. Otherwise a new chain is cre- 
ated (step 4). The chain radius r c h is set to the largest rj,BH 
found in the previous step and all particles within r c h are 
added to chain independent of their time step. Any chain 
particle with U t sys , where t sys is the current system time, 
is integrated to this time. Then a pseudo chain particle is 
created with the COM coordinates of the chain particles. 
The chain particles are also removed from the list of active 
particles and their masses are set to zero on the GRAPE. 

All particles within r per t are selected as perturber par- 
ticles (5) and the perturber radius is calculated from 

r ^ t = [jM7j xL5rcrit ' (12) 

where m* is the mass of a particle and M c h the total mass 
in the chain. The parameter F controls the accuracy of the 
calculation of forces on the chain. Typically, F = 10~ 6 is 



used. In principle r per t should be proportional to r c h. How- 
ever, r c h can sometimes change significantly during a chain 
step and it turned out that using 1.5r cr it results in a much 
smaller integration error (the additional factor of 1.5 is due 
to the condition for particles to leave the chain). It also al- 
lows the perturber list to be re-used for several steps making 
the algorithm more efficient. Perturber particles act on the 
resolved chain as described below. 

The force on the COM particle is calculated (6) in two 
steps. First, the force of the non-perturbers (i.e. particles 
outside of r port ) on the COM particle is computed. This 
is done by loading the COM particle to the GRAPE and 
setting the masses of all perturber particles to zero at the 
same time. Then, a standard GRAPE call is used to compute 
the force, and the masses of the perturbers are set back 
to normal. In the second step, the forces of the perturbers 
on each chain particle are computed on the host. (In our 
applications, the number of perturbers is usually less than 
100. Some problems may require a much larger number of 
perturbers, in which case it would be more efficient to use 
the GRAPE to do this force calculation as well.) The force 
on the individual chain particles is summed up according to 

acOM.pcrt = Jj— m »ch a ich ( 13 ) 

to give the total force on the COM particle (and likewise for 
the force derivatives). 

In the next step (7), the chain particles are advanced for 
one system time step (which is the shortest time step needed 
by a non-chain particle and ideally longer than the shortest 
regular time step of the chain particles) . This is achieved by 
sending the masses, positions and velocities of the chain and 
perturber particles plus the forces and force derivatives of 
the perturbers to AR-CHAIN. The COM of the chain can 
change during this step because the chain is perturbed but 
any change is subtracted at this point. 

Now a predictor-corrector step is made for the COM 
particle (step 8). Using the force calculated in step (6), po- 
sition and velocity are predicted with equation (2) , and these 
are then used to recompute the force on the COM particle 
in the same way as described above. The corrected position 
and velocity can then be calculated by equation (5) . 

The new position of the COM particle, together with 
the new positions and velocities of the chain particles calcu- 
lated in the chain (7) can be used to resolve the chain, thus 
the true positions and velocities at the end of the current 
time step can now be written to the memory and GRAPE 
(including the COM particle). With the resolved chain it 
is now possible to compute the forces for the active parti- 
cles. This again is done in two steps for particles outside 
and inside of r rcs = F _1 / 3 7 , c h: First all active particles out- 
side of r rcs are integrated with the chain particles replaced 
by the COM particle. Then the active particles inside r res 
are integrated seeing the resolved chain and not the COM 
particle. This is done by removing the COM particle from 
the GRAPE and by setting the masses of the chain particles 
back to normal. 

Finally, a new time step can be computed for all active 
particles (note that the chain particles are not counted as 
active but everything in the chain is integrated in every step 
nonetheless). Positions and velocities are also updated on 
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both the host and the GRAPE. Then, the integration can 
continue with the next step (going back to (2)). 

If the chain is already active at the beginning of the 
time step, non-chain particles are checked to see whether 
they have entered the chain. A particle enters the chain if 
its distance to the chain COM is smaller than the chain ra- 
dius r C rit- Any particle that enters the chain is, if needed, 
syncronized to t sys first. Particles leave the chain if their dis- 
tance to the COM becomes larger than 1.5r cr it. The factor 
of 1.5 ensures that particles do not enter and leave the chain 
too often (to avoid the overhead associated with extra inter- 
nal initialization each time chain membership is changed). A 
particle leaving the chain has its force, derivative and time 
step initialized and this information is sent to the GRAPE 
when the mass of this particle is set back to normal. The ini- 
tialization of the time step is particularly important: errors 
are easily introduced into the integration if the time step is 
chosen too large. We use Eq. ^ with a rather small value 
for r)s, e.g. r)s = 10~ 2 r]. 

The search of perturber particle in step (5) is computa- 
tionally rather expensive and is therefore not done in every 
step. A parameter At pclt is used to determine how often the 
list of perturbers is renewed (note that only the list of per- 
turbers is unchanged for some time, current position etc. are 
still used within the chain). In addition, the perturber list is 
renewed whenever chain membership is changed. Also, the 
force on the COM particle only needs an update if the chain 
membership has changed. 

Finally, an extra GRAPE call is needed to compute the 
potential for chain particles if at the end of a step if output 
of energies, etc. is required. 



3 RESULTS OF PERFORMANCE TESTS 

We tested the performance of the hybrid code using various 
realizations of a model designed to mimic the density profile 
of the star cluster around the Milky Way supermassive black 
hole (Fig. [I}. The model has a mass density profile 

P(r) = /9 °(i) 32 ™P[-Hr/Re) yn ]- (14) 

This is a p ~ r ~ 3 / 2 power-law near the center, similar to 
what is observed in the inner ~ 0.1 pc of the Milky Way 
|Schodel et al.ll2007l ). An Einasto-like truncation was ap- 
plied to give the model a finite total mass; the Einasto index 
was n — 2 and we adjusted 6 such that the truncation begins 
at a radius of ~ 0.1R e . The total mass in stars was fixed to 
be equal to that of the BH particle, i.e. all "star" particles 
have mass m = M./N . Henceforth units are adopted such 
that G = M, — R e — 1; the model can be scaled approxi- 
mately to the Galactic center by setting the length unit to 
lpc and the mass unit to 3 x 1O 6 M0, making the unit of 
time ~ 10 4 yr. 

Initial positions and velocities for the N "star" parti- 
cles were generated as follows. (1) Eddington's formula was 
used to compute the unique, isotropic phase-space distribu- 
tion function fi{E), i = 1, 4 that reproduced the adopted 
pi(r) of each species in the combined gravitational poten- 
tial of the BH and the stars. (2) Distances r from the BH 
were generated by sampling randomly from the integrated 
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Figure 1. Density and cumulative mass profiles of the model 
used in the performance tests. The density obeys p oc r~ 1,5 near 
the center (dashed line) and is truncated via Einasto's law beyond 
r = 0.1. The total mass in stars equals that of the "black hole" 
particle. This model can be roughly scaled to the Galactic center 
star cluster by setting the length unit to 1 pc and the total stellar 
mass to 3 X 1O 6 M . 



mass profiles, and (x,y,z) coordinates were assigned by se- 
lecting random positions on each sphere of radius r. (3) The 
(isotropic) velocity distribution at this radius was computed 
from $(r) and f(E) = f[v 2 /2 + $]. (4) The magnitude of 
the velocity was selected randomly from this function, and 
the Cartesian velocity components were generated in a man- 
ner analogous to the position coordinates. Unless otherwise 
specified, performance tests were carried out on single nodes 
of gravitySimulato r, a 32-node clust er with a GRAPE6-A 
card on each node |Harfst et al.ll2007l ). 

Figure [2] shows the integration time as a function of r\ 
for various TV and for fixed r cr i t = 4 x 10 -4 . For this value 
of r cr i t , the typical number of particles in the chain, at any 
given time, is roughly linear in the total particle number. 
The number of particles in the chain is also small enough 
so that the total execution time is dominated by integration 
of particles outside the chain. This can be seen in Fig. [5] 
which shows that the scaling with the number of particles is 
approximately N 2 (the scaling in the chain-dominated case 
is discussed below). 

Figures and U show the energy conservation and 
elapsed time for integrations until t — 1 for the case N = 10 4 
and for various values of rj, the accuracy parameter in the 
Hermite integrator (eq. [TJ, and r CT it, the maximum distance 
from the black hole at which a particle enters the chain 
(eq. Illb|) ; t cr it (eq. Illb|) was fixed at 5 x 10 -5 , the soften- 
ing parameter was e = 10 pc and post-Newtonian terms 
were not included. (Including the PN terms was found to 
affect the speed of the code only very slightly; they were 
omitted in order to simplify the discussion of energy conser- 
vation.) Also shown is the performance of ipGRAPE without 
the regularized chain. The figures show the expected scaling 




Figure 2. Elapsed time as a function of N and r\ in integrations 
until time t = 1, and with r cr ; t = 4 X 10 — 4 . 
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Figure 3. Energy conservation in integrations until t = 1 (~ 10 4 
yr) of the model illustrated in Fig.[T]with 10 4 particles. Black line 
(asterisks) are for i/jGRAPE without the regularized chain. 



Figure 4. Elapsed time for the integrations of Fig. [3] 
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Figure 5. TV-dependence of the elapsed time for integrations until 
t = 1 of the model in Figure [T] with r] = 0.01. As the mean 
number of particles in the chain increases, the TV-scaling changes 
from ~ N 2 (dotted line) to ~ N 3 (dashed line). 



of the Hermite scheme with the accuracy parameter: time 
steps increase linearly with r\ making the integration faster 
and less accurate. For a fourth-order scheme, the relative 
energy error scales as ~ dt 5 ~ J? 5//2 , though in the case of 
the hybrid code the relation is modified by the presence of 
the chain. Energy conservation generally improves for larger 
values of r cr u, as more and more particles are removed from 
the A^-body integration and are treated more accurately in 
the chain. The integration time increases rapidly with r cr i t 
reflecting the ~ n 3 h dependence of the chain. Nevertheless it 



is clear that for all values of rj, there exist values of r cr i t such 
that the hybrid code is both more accurate, and faster, than 
<^GRAPE alone. This is presumably because the additional 
computational overhead associated with the regularization 
scheme is more than compensated for by the longer mean 
time steps of particles outside the chain. 

The A^-scaling of the integration time is more apparent 
in Fig. [S] When r cr i t is small, most of the computation is 
spent on the non-chain particles and the performance scales 
as ~ N 2 . As r c r it increases, so does the typical number 
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Figure 6. Average number of chain particles for integrations until 
t = 1 of the model in Figure [T] with 77 = 0.01. 



rich of particles in the chain, and the JV-scaling of the to- 
tal integration time changes to ~ n 3 h ~ A 3 . Along the line 
r cr i t = 2 x 1CP 3 in Figure [5] which roughly marks the tran- 
sition from an ~ A^ 2 to an ~ A 3 scaling, the mean fraction 
rich/N of particles in the chain varies from ~ 3 x 10 -4 to 
~ 7 x 10~ 4 . If we postulate a performance model in which 
the total integration time is simply the sum of the time spent 
on particles in the chain, tck = v4n 3 h , plus the time spent 
on particles outside the chain, ijv = BN 2 , then the fraction 
(n c h/Af) cr it at which the chain begins to dominate the total 
time should scale as A -1 / 3 . We did not attempt to verify this 
prediction in detail, but if we adopt (n c h/Af) C rit ~ 5 x 1CP 4 
at N = 10 4 , the model predicts (n ch /A) crit ~ 0.01A" 1/3 . 
This relation can be taken as defining the effective upper 
limit to the number of particles to include in the chain; for 
N = 10 6 , n ch , cri t w 100. 

The average fraction of chain particles in the reported 
integrations spans the range (3 — 7) x 10 -4 . This reflects 
a A-dependence of the average number of chain particles, 
as can be seen in Fig. \6\ In these test runs, we set a hard 
maximum number of chain particles of n c h, cr it = 50. 

The results presented in this section constitute a par- 
ticularly severe test of the code, since all of the particles are 
moving, at all times, in the essentially Keplerian potential 
of the central mass. In many other applications, the sphere 
of influence of the massive particle(s) would only include a 
fraction of the other particles in the simulation, and a choice 
of r cr it could be made that included all the particles in this 
region without forcing n c h to unreasonable values. 



4 APPLICATIONS 

4.1 Evolution of Orbits near the Galactic Center 
Black Hole 

AR-CHAIN conserves the Keplerian elements of unper- 
turbed two-body orbits with extremely high precision, even 
for arbitrarily large mass ratios, and this fact makes the hy- 
brid code uniquely suited to investigating the detailed effects 
of perturbations on the orbits of individual stars around the 
galactic center supermassive black hole (SMBH). 

We illustr ate this using the collisionally relaxed, multi- 
mass model of iHopman fc Alexander! (|2006j) (hereafter the 
"HA06 model"). This model has four components in addi- 
tion to the SMBH: main sequence (MS) stars, m = IMq; 
white dwarfs (WD), m = 0.6Mq; neutron stars (NS), m = 
1.4Mq; and stellar-mass black holes (BH), m = 1OM . The 
HA06 model was derived as a steady-state solution of the 
isotropic, orbit-averaged Fokker-Planck equation assuming 
a SMBH mass of 3 x 10 6 Mq, and including an approxi- 
mate term representing loss of stars into the tidal disrup- 
tion sphere. The contribution of the stars to the gravita- 
tional potential was ignored, making the solution valid only 
within the SMBH's influence radius, r < 1 pc. The three 
lighter species (which dominate the density at most radii) 
have p ~ r *, 1.4 < 7 < 1.5, r > 3mpc while the heavier 
BHs have a steeper profile, 7 ~ 2. Detailed density profiles 
for the four species were kindly provided by T. Alexander. 
We modified the HA06 model by imposing a steep trunca- 
tion like that of Figure \T\ to the density of each species be- 
yond r = 0.1 pc, then computed the self-consistent isotropic 
phase-space density fi(E),i — 1, 4 corresponding to each 
species from the truncated Pi(r) profiles using Eddington's 
formula. Finally, Monte-Carlo positions and velocities were 
generated from the pi and fi. The total number of objects 
of all types was found to be ~ 75, 000, with ~ 10 3 objects 
(mostly MS stars) within 0.01 pc; the latter number is consis- 
tent w ith the value given in Table 1 of lHopman fc Alexander! 
(2006). Hereafter, we refer collectively to stars and stellar 
remnants in the A'-body simulation simply as "stars." 

The non-zero masses of the stars cause the A^-body 
gravitational potential to deviate slightly from the fixed Ke- 
plerian potential of the SMBH, and the resultant perturba- 
tions cause the orbital elements of any single ("test") star 
to evolve. 

As test stars, we included five particles with orbital ele- 
ments corresponding to the five, shortest-period S-stars ob- 
served near the gal actic center: S0-1 , S0-2, SO-16, S0-19, and 
SO-20. The S-stars (|Ghez et al.ll2003l ; lEisenhauer et al.1l2005T) 
are a cluster of main sequence, O-B stars that orbit around 
Sagittarius A* with periods as short as 15 yr. Their short 
periods and their proximity to the supermassive black hole 
make them very interesting objects, e.g. for constraining the 
mass of the black hole or studying the effects of dynami- 
cal and/or relativistic perturbations on their orbits. In our 
model, the masses of these five S-stars were set to 15Mq 
and initial positions and velocities were determined at year 
2000 A D using the Kepler ian orbital elements given in Ta- 
ble 3 of iGhez et al.l (|2005r ). (Initial velocities of the S-stars 
were adjusted, at fixed a and e, to account for the slightly 
different values of Mbh assumed by Hopman & Alexander 
(2006) and Ghez et al. (2005).) These stars are being con- 
stantly monitored and deviations of their orbits from closed 
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Figure 7. Orbit of the galactic center star SO-2 over a time of 
100 yr in the case of r cr ; t = 2 mpc (blue) and r cr ; t = 12 mpc 
(red). The total number of stars in these integrations was 10 3 
and post-Newtonian terms were not included. 



Figure 8. Evolution of the orbital elements of star SO-2 in in- 
tegrations with N = 10 3 stars and r cr ; t = 2 mpc (blue) and 
r cri t = 12 mpc (red). 



Keplerian ellipses might be used to test various hypothe- 
ses about the distribution of mat ter near the SMBH , or as 
a test of general relativity (e.g. Fragile & Mathews 200fj; 
Rubilar fc Eckart||200ll; IWeinberg etaT. 2005; Zucker et all 



20061 ; I Willi l200cf ) 



The total number of stars in the HA06 model contained 
within the S-star orbits is > 10 3 , too large for all of them 
to be included in the chain at one time; this necessitates a 
choice of r cr it such that the S-stars will pass in and out of the 
chain in each orbit. We first verified that passage through 
r cr i t did not in itself introduce significant changes in the 
orbital elements. Figures [7] and [8] show the results of one 
such set of tests, which followed the orbit of SO-2 for two 
different values of r cr i t : r cr i t = 2 mpc and r cr i t = 12 mpc, 
compared with the semi-major axis length of 4.5 mpc. The 
total number of stars in this test was set to 10 3 in order that 
all particles within the largest r cr it could be included in the 
chain without exceeding a chain membership of 100. Post- 
Newtonian terms were not included. Figure [8] shows the evo- 
lution of the five classical elements of the Keplerian orbit of 
SO-2 over a time span of 10 3 years. There are only slight dif- 
ferences between the two runs, verifying that entrance into, 
or departure from, the chain does not significantly influence 
the orbit. 

The evolution of the semi-major axis and eccentricity 
of the all the S-stars is shown in Fig. [9] for an integration 
with N = 7.5 x 10 4 stars and r cr it = 0.8 mpc. While the 
eccentricity evolves more or less linearly with time for all 
the S-stars, the semi-major axis shows essentially a random 
evolution. 

The timescale for apoastron precession in the S- 
stars (represented b y Aui in Fig. [SJ) is > 10 5 yr (e.g. 
I Weinberg et al.l 120051 ). On shorter timescales, the angular 
momentum of stars like SO-2 should evolve approximately 
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Figure 9. Evolution of the semi-major axis and eccentricity vari- 
ation for the five S-stars in an integration with N = 7.5 X 10 4 stars 
and r cr it = 0.8 mpc. 



linearly with time due to the (essentially fixed) torques re- 
sulting from finite- N departures of the overall p otential from 
spherical symmetry (|Rauch fc Tremainelll99a ). 

This evolution is illustrated for all the S-stars in Fig- 
ure [10] The blue lines in that figure are from an integra- 
tion that used the complete set of N = 7.5 x 10 4 stars in 
the Monte-Carlo realization of the HA06 model, and r cr i t = 
0.8 mpc; also shown are integrations that used randomly- 
chosen subsets of 10 4 stars (r cr it = 2 mpc) and 10 3 stars 
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(r crit = 10 mpc) from this model. Plotted in Figure [10] are 
the two Keplerian elements (i, Q.) — (inclination, right as- 
cension of ascending node) that measure the orientation of 
the orbital planes. These angles would remain precisely con- 
stant in any spherical potential and their evolution is due 
entirely to finite- N departures of the potential from spher- 
ical symmetry. The two angles are related to the Cartesian 
components of the angular momentum via 

L x = L sin i sin f2, 



L y = — Lsinicosfi 



(15a) 
(15b) 

L z = Lcosi. (15c) 

Simple arguments (|Rauch fc Tremaind Il996l ) suggest 
that orbital inclinations should evolve in this regime ap- 
proximately as 

t 



A(i,ft) 



M. 



1/2 



A- 



GN 



2tt \M.a s 



1/2 



(16a) 
(16b) 



where m is a typical perturber mass, N is the number of 
stars within a sphere of radius a, the semi-major axis of the 
test star, and P(a) is the (Keplerian) orbital period. The 
coefficient A is of order unity a nd is believed to depend 
weakly on orbital eccen tricity ()Rauch fc Tremaind Il996t 
IGiirkan fc Hopmanll2007h . We evaluated equation (|16b)l nu- 
merically for the N = 75K model and found that the dom- 
inant contribution to the torques is predicted to come from 
the BH particles; the predicted change in orientation of a 



test star over 10 yr, for A = 1, is ~ 0.5° for 



10 mpc 



increasing to ~ 1° for a = 2 mpc. This is quite consistent 
with Figure [TOl if 2 < A < 3. We also do not see an obvious 
contradiction with the results of IGiirkan fc Hopmanl (|2007T ) 
with regard to the eccentricity dependence of A. However, 
a more detailed study is needed to give a definite answer 
to this question. The dependence of the evolution on N in 
Figure[l0]is also consistent with the N 1 ^ 2 prediction of equa- 
tion |[I5)|. 



4.2 Inspiral of an IMBH into the Galactic Center 

As a second test problem, we used </?GRAPEch to follow 
the inspiral of an intermediate-mass black hole (IMBH) into 
the Galactic SMBH. The multi-mass stellar cusp model de- 
scribed in the previous sub-section was again used, with 
N = 75K. The second black hole was given a mass of 10 -3 
times that of the SMBH, or 3 x 1O 3 M ; its initial orbit 
around the SMBH had semi-major axis 0.1 mpc and its ec- 
centricity was 0.9. This initial separation is of the same order 
as the so-called hard-binary separation an at which inspiral 
due to dynamical friction alone would be expected to stall 
(e.g. iGualandris fc Merrittll200Sl . eq. 4.1). The large eccen- 
tricity was chosen primarily to accelerate the inspiral. The 
integration used r cr i t = 0.8 mpc, rj — 0.01 and e = 10 -5 mpc 
and required a time of ~ 60 hr on one node of the GRAPE 
cluster. 

Figure [11] shows the time evolution of the distance be- 
tween the IMBH and the SMBH in this integration, and in 
a second integration in which ARCHAIN was used to fol- 
low the binary in the absence of stars. It can be seen from 
Fig. [11] that the stars have no significant effect on the rate 
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Figure 11. Evolution of the distance between the IMBH and 
the SMBH over the inspiral time. The blue points refer to the 
simulation containing TV = 75K stars while the red points refer 
to a simulation of the black hole binary in isolation. 



of inspiral. Figure [12] (top panel) shows the trajectory of the 
IMBH over a time of 4.5 yr, which roughly corresponds to 
the time for u) to precess 360° twice (bottom panel). The 
longitude of the periastron is predicted to advance by an 
amount 



Am » 0.15° (1 



3 x 10 6 M Q 



mpc 



(17) 



each orbital period; this corresponds to ~ 8° per revolution 
for a — 0.1 mpc and e = 0.9. 

In the absence of stars, evolution of the IMBH/SMBH 
binary would take place in a fixed plane, i.e. i and Q would 
remain constant. In the presence of stars, however, devi- 
ations of the potential from spherical symm etry cause th e 
orientation of the binary to change with time (|Merrittll2002l ). 
The evolution of the Keplerian elements of the binary can 
be seen in Figure 1131 which shows a substantial change in 
the binary's orbital plane during the course of the inspiral. 
Simi lar evolution has been observed in other A^-body studie s 
(e.g. iMilosavlievic fc MerrittllioOll ; iBaumgardt et al.l 2006). 

Also shown in this figure are the evolution of the semi- 
major axis and eccentricity of the binary. Results from 
i^GRAPEch are compared with numerical integrations of 
the coupled equations (Peters 1964) 



da 



■. G 3 Mi M 2 (Mi + M 2 ) 



5 c 5 



de _ 304 G 3 Mi M 2 (Mi + M 2 ] 



dt 



15 c 5 



Fie) 



G(e) 



where 
F(e) 

G(e) 



. , 2 x-7/2 / 73 2 37 4 

( '- e) fl + 24 e+ 96 e 



2N-5/2 / 121 2 



(18) 
(19) 

(20a) 
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Figure 10. Evolution of the orbital plane of the S-stars over a time span of 10 4 years in integrations with different particle numbers: 
N = 10 3 (magenta), N = 10 4 (green), TV = 7.5 X 10 4 (blue). The left panels show the inclination angle i while the right panels show 
position angle of the nodal point Q. 



The agreement is very good. 

Figure [14] shows the evolution of the orbits of the three 
innermost S-stars in this simulation. The periapse distance 
of SO-16 is 0.2 mpc, roughly equal to the initial apoapse dis- 
tance of the IMBH. This orbit evolves strongly due to inter- 
actions with the IMBH. The semi-major axis increases by a 
factor of ~ 4 and the eccentricity increases almost to one. If 
the inspiral were prolonged, e.g. by making the IMBH orbit 
less eccentric, Fig. [14] suggests that this star and star SO-2 
might be ejected completely aft er a few tens of thousands of 
years (|Mikkola fc Merritul2008l ). 

The binary ejects stars that interact strongly with it 
and such ejections a re a possible sour ce of the so-called 
hyper- velocity stars l|Brown e t al. 2006). Figure [151 shows 
the distribution of ejection velocities for stars unbound to 
the SMBH at the end of the IMBH inspiral. The peak of the 
distribution is at Vp Ca k ~ 300kms~ . Interestingly, about 



30% of the ejected stars have velocities > 700kms _1 , i.e. 
large enough to escape the bulge and reach the Galactic 
halo as hyper-velocity stars. About 150 stars are ejected 
during the inspiral, which results in an average ejection rate 
of ~ 20000 Myr -1 . This rate is somewhat higher than ob- 
served in simulations that start with a more separated bi- 
nary (e.g. lBaumgardt et al . 2006), presumably because most 
of the ejections we see are from stars on orbits that intersect 
the binary at time zero, and many of these stars would have 
been ejected at earlier times. Most of the escapers are main- 
sequence stars, which is not surprising given their dominance 
in the cluster. Nonetheless, a handful of stellar mass black 
holes are ejected with velocities up to 3000 km s -1 . 
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Figure 12. (Top) Trajectory of the IMBH over a time of 4.5 
yr, showing precession of the periastron due to the PN terms. 
(Bottom). Periastron advancement for the IMBH orbit. 



5 SUMMARY 

We have described a new A-body code, called (/jGRAPEch, 
designed for simulations of the central regions of galax- 
ies containing single or multiple supermassive black holes. 
Based on the serial implementation of i^GRAPE, the 
new code incorporates the algorithm ic chain regularization 
scheme of lMikkola fc Merrittl (2008) to treat orbits near the 
central black hole(s) with high precision. Post-Newtonian 
terms are included up to PN2.5 order. 

In performance tests, we find that the hybrid code 
achieves better energy conservation in less computation time 
when compared to the standard pure 4th-order Hermite inte- 
gration scheme. A simple performance model indicates that 
the computation time for particles in the chain will dom- 
inate the total computation time if the fraction of chain 
particles exceeds a certain limit. This limit is roughly 100 
chain particles for a total number of particles of one million. 

We then apply our new hybrid code to a model of the 
Galactic center that includes a supermassive black hole and 
four different stellar mass components. The five shortest- 
period S-stars were also included as test stars. We show that 
the orbits of S-stars are integrated with very high precision. 
For the first time, we were able to measure the change in 
the orbital plane of S-stars due the effect of the finite- N 
departures of the potential from spherical symmetry. The 
measured changes were in good agreement with theoretical 
predictions. 

As a second application, we added an inspiraling 
intermediate-mass black hole to our model of the Galactic 
center. The time scale of the inspiral was found to be unaf- 
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Figure 13. Evolution of the Keplerian elements of the black hole 
binary over the full inspiral time. The red lines in the upper two 
panels show the evolution of the semi-major axis and eccentricity 
obtained by integration of Eqs. I18l and 1191 
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Figure 14. Orbital elements of the S-stars SO-2 (blue), SO-16 
(red) and SO-19 (green) over a time of 6000 yr. 
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Figure 15. Velocity distribution for stars and stellar remnants 
escaping from the SMBH at the end of the IMBH inspiral. The dis- 
tribution for all escapers is presented in black, while the curves for 
individual species are: main-sequence (blue), white-dwarfs (red), 
neutron stars (green), black holes (magenta). 

fected by the stars, in agreement with theoretical predictions 
given the small initial separation. However, the orbital plane 
of the inspiral was affected by perturabations from the stars. 
A number of stars were ejected from the center with veloci- 
ties large enough to reach the Galactic halo as hyper- velocity 
stars. 
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